Pointwise mutual information (PMI), or point mutual information, is a measure of association used in information theory and statistics.
The PMI of a pair of outcomes x and y belonging to discrete random variables X and Y quantifies the discrepancy between the probability of their coincidence given their joint distribution and the probability of their coincidence given only their individual distributions, assuming independence. Mathematically:

$$\operatorname{pmi}(x;y) \equiv \log\frac{p(x,y)}{p(x)\,p(y)} = \log\frac{p(x\mid y)}{p(x)} = \log\frac{p(y\mid x)}{p(y)}$$
The mutual information (MI) of the random variables X and Y is the expected value of the PMI over all possible outcomes, with respect to the joint distribution $p(X,Y)$:

$$I(X;Y) = \mathbb{E}_{p(X,Y)}\left[\operatorname{pmi}(x;y)\right]$$
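As a minimal sketch of these two definitions (the dictionary-based representation and function names below are our own choices, not a standard API):

```python
from math import log2

def pmi(p_xy: float, p_x: float, p_y: float) -> float:
    """PMI of a single outcome pair (x, y), in bits: the log ratio of
    the joint probability to the product of the marginals."""
    return log2(p_xy / (p_x * p_y))

def mutual_information(joint: dict, p_x: dict, p_y: dict) -> float:
    """MI as the expectation of PMI under the joint distribution.
    `joint` maps (x, y) pairs to probabilities; `p_x` and `p_y` map
    single outcomes to their marginal probabilities."""
    return sum(p * pmi(p, p_x[x], p_y[y]) for (x, y), p in joint.items())
```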
The measure is symmetric ($\operatorname{pmi}(x;y) = \operatorname{pmi}(y;x)$). It can take positive or negative values, but is zero if X and Y are independent. PMI is maximized when X and Y are perfectly associated (i.e. $p(x\mid y)$ or $p(y\mid x)$ equals 1), yielding the following bounds:

$$-\infty \le \operatorname{pmi}(x;y) \le \min\left[-\log p(x),\, -\log p(y)\right]$$

Finally, $\operatorname{pmi}(x;y)$ will increase if $p(x\mid y)$ is fixed but $p(x)$ decreases.
Here is an example to illustrate:
x | y | p(x, y) |
---|---|---|
0 | 0 | 0.1 |
0 | 1 | 0.7 |
1 | 0 | 0.15 |
1 | 1 | 0.05 |
Using this table we can marginalize to get the following additional table for the individual distributions:
value | p(x) | p(y) |
---|---|---|
0 | 0.8 | 0.25 |
1 | 0.2 | 0.75 |
With this example, we can compute four values for $\operatorname{pmi}(x;y)$. Using base-2 logarithms:
pmi(x;y) | value |
---|---|
pmi(x=0;y=0) | −1 |
pmi(x=0;y=1) | 0.222392421 |
pmi(x=1;y=0) | 1.584962501 |
pmi(x=1;y=1) | −1.584962501 |
(For reference, the mutual information $I(X;Y)$ would then be 0.214170945 bits.)
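These numbers can be reproduced directly; here is a short Python sketch (the variable names are ours) that marginalizes the example's joint distribution and prints the four PMI values and the mutual information:

```python
from math import log2

# Joint distribution p(x, y) from the example table.
joint = {(0, 0): 0.1, (0, 1): 0.7, (1, 0): 0.15, (1, 1): 0.05}

# Marginalize to obtain p(x) and p(y).
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

# The four PMI values, using base-2 logarithms.
for (x, y), p in sorted(joint.items()):
    print(f"pmi(x={x};y={y}) = {log2(p / (p_x[x] * p_y[y])):+.9f}")

# Mutual information: the expectation of PMI under the joint distribution.
mi = sum(p * log2(p / (p_x[x] * p_y[y])) for (x, y), p in joint.items())
print(f"I(X;Y) = {mi:.9f}")  # prints 0.214170945
```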
Pointwise mutual information has many of the same relationships as the mutual information. In particular,

$$\operatorname{pmi}(x;y) = h(x) + h(y) - h(x,y) = h(x) - h(x\mid y) = h(y) - h(y\mid x)$$

where $h(x)$ is the self-information, or $-\log_2 p(x)$.
Pointwise mutual information can be normalized between [−1, +1], resulting in −1 (in the limit) for never occurring together, 0 for independence, and +1 for complete co-occurrence:

$$\operatorname{npmi}(x;y) = \frac{\operatorname{pmi}(x;y)}{h(x,y)}$$

where $h(x,y) = -\log_2 p(x,y)$ is the joint self-information.
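A minimal sketch of this normalization (the helper name npmi is our own):

```python
from math import log2

def npmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Normalized PMI: pmi(x;y) divided by the joint self-information
    -log2 p(x, y). Ranges over [-1, +1]."""
    return log2(p_xy / (p_x * p_y)) / -log2(p_xy)

print(npmi(0.50, 0.5, 0.5))  # +1.0: x and y only ever occur together
print(npmi(0.25, 0.5, 0.5))  #  0.0: independence, p(x,y) = p(x)p(y)
print(npmi(1e-9, 0.5, 0.5))  # ~ -0.93; tends to -1 as p(x, y) -> 0
```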
Pointwise mutual information follows the chain rule, that is,

$$\operatorname{pmi}(x;yz) = \operatorname{pmi}(x;y) + \operatorname{pmi}(x;z\mid y)$$
This is easily proven by:

$$\begin{aligned}
\operatorname{pmi}(x;y) + \operatorname{pmi}(x;z\mid y)
&= \log\frac{p(x,y)}{p(x)\,p(y)} + \log\frac{p(x,z\mid y)}{p(x\mid y)\,p(z\mid y)} \\
&= \log\frac{p(x\mid y)\,p(y)\,p(x,z\mid y)}{p(x)\,p(y)\,p(x\mid y)\,p(z\mid y)} \\
&= \log\frac{p(x,z\mid y)}{p(x)\,p(z\mid y)} \\
&= \log\frac{p(x,y,z)/p(y)}{p(x)\,p(y,z)/p(y)} \\
&= \log\frac{p(x,y,z)}{p(x)\,p(y,z)} \\
&= \operatorname{pmi}(x;yz)
\end{aligned}$$
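As a numeric sanity check, here is a short Python sketch; the three-variable joint distribution below is invented purely for illustration:

```python
from math import log2, isclose

# A hypothetical joint distribution over three binary variables (x, y, z);
# the probabilities are made up for illustration and sum to 1.
p_xyz = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.15,
    (0, 1, 0): 0.20, (0, 1, 1): 0.05,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10,
    (1, 1, 0): 0.05, (1, 1, 1): 0.30,
}

def marg(keep):
    """Marginalize p_xyz onto the axes in `keep`, e.g. (1, 2) for (y, z)."""
    out = {}
    for xyz, p in p_xyz.items():
        key = tuple(xyz[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

p_x, p_y = marg((0,)), marg((1,))
p_xy, p_yz = marg((0, 1)), marg((1, 2))

x, y, z = 1, 1, 0

# Left-hand side: pmi(x;yz), treating (y, z) as a single outcome.
lhs = log2(p_xyz[(x, y, z)] / (p_x[(x,)] * p_yz[(y, z)]))

# Right-hand side: pmi(x;y) + pmi(x;z|y); every conditional below is a
# joint probability divided by p(y).
pmi_xy = log2(p_xy[(x, y)] / (p_x[(x,)] * p_y[(y,)]))
pmi_xz_given_y = log2(
    (p_xyz[(x, y, z)] / p_y[(y,)])
    / ((p_xy[(x, y)] / p_y[(y,)]) * (p_yz[(y, z)] / p_y[(y,)]))
)
assert isclose(lhs, pmi_xy + pmi_xz_given_y)  # the chain rule holds
```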